rate-distortion function
- North America > United States > Texas > Travis County > Austin (0.04)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- North America > United States > Colorado > Boulder County > Boulder (0.04)
- (3 more...)
- North America > United States > California > Santa Clara County > Palo Alto (0.05)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Data Science > Data Mining > Big Data (0.47)
517da335fd0ec2f4a25ea139d5494163-Paper.pdf
Itisoften the responsibility of the agent designer toconstruct thistargetwhich,inrichandcomplexenvironments,constitutesaonerousburden; without full knowledge of the environment itself, a designer may forge a suboptimal learning target that poorly balances the amount ofinformation an agent must acquire to identify the target against the target's associated performance shortfall.
- North America > United States > California > Santa Clara County > Stanford (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
Estimating the Rate-Distortion Function by Wasserstein Gradient Descent
In the theory of lossy compression, the rate-distortion (R-D) function $R(D)$ describes how much a data source can be compressed (in bit-rate) at any given level of fidelity (distortion). Obtaining $R(D)$ for a given data source establishes the fundamental performance limit for all compression algorithms. We propose a new method to estimate $R(D)$ from the perspective of optimal transport. Unlike the classic Blahut--Arimoto algorithm which fixes the support of the reproduction distribution in advance, our Wasserstein gradient descent algorithm learns the support of the optimal reproduction distribution by moving particles. We prove its local convergence and analyze the sample complexity of our R-D estimator based on a connection to entropic optimal transport. Experimentally, we obtain comparable or tighter bounds than state-of-the-art neural network methods on low-rate sources while requiring considerably less tuning and computation effort. We also highlight a connection to maximum-likelihood deconvolution and introduce a new class of sources that can be used as test cases with known solutions to the R-D problem.
- North America > United States > Texas > Travis County > Austin (0.04)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- North America > United States > Colorado > Boulder County > Boulder (0.04)
- (3 more...)
An Information-Theoretic Framework for Deep Learning
Each year, deep learning demonstrates new and improved empirical results with deeper and wider neural networks. Meanwhile, with existing theoretical frameworks, it is difficult to analyze networks deeper than two layers without resorting to counting parameters or encountering sample complexity bounds that are exponential in depth. Perhaps it may be fruitful to try to analyze modern machine learning under a different lens. In this paper, we propose a novel information-theoretic framework with its own notions of regret and sample complexity for analyzing the data requirements of machine learning. We use this framework to study the sample complexity of learning from data generated by deep ReLU neural networks and deep networks that are infinitely wide but have a bounded sum of weights. We establish that the sample complexity of learning under these data generating processes is at most linear and quadratic, respectively, in network depth.
- North America > United States > California > Santa Clara County > Stanford (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.05)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Data Science > Data Mining > Big Data (0.47)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
The Larger the Merrier? Efficient Large AI Model Inference in Wireless Edge Networks
Lyu, Zhonghao, Xiao, Ming, Xu, Jie, Skoglund, Mikael, Di Renzo, Marco
The growing demand for large artificial intelligence model (LAIM) services is driving a paradigm shift from traditional cloud-based inference to edge-based inference for low-latency, privacy-preserving applications. In particular, edge-device co-inference, which partitions LAIMs between edge devices and servers, has emerged as a promising strategy for resource-efficient LAIM execution in wireless networks. In this paper, we investigate a pruning-aware LAIM co-inference scheme, where a pre-trained LAIM is pruned and partitioned into on-device and on-server sub-models for deployment. For analysis, we first prove that the LAIM output distortion is upper bounded by its parameter distortion. Then, we derive a lower bound on parameter distortion via rate-distortion theory, analytically capturing the relationship between pruning ratio and co-inference performance. Next, based on the analytical results, we formulate an LAIM co-inference distortion bound minimization problem by jointly optimizing the pruning ratio, transmit power, and computation frequency under system latency, energy, and available resource constraints. Moreover, we propose an efficient algorithm to tackle the considered highly non-convex problem. Finally, extensive simulations demonstrate the effectiveness of the proposed design. In particular, model parameter distortion is shown to provide a reliable bound on output distortion. Also, the proposed joint pruning ratio and resource management design achieves superior performance in balancing trade-offs among inference performance, system latency, and energy consumption compared with benchmark schemes, such as fully on-device and on-server inference. Moreover, the split point is shown to play a critical role in system performance optimization under heterogeneous and resource-limited edge environments.
- Europe > United Kingdom > England > Greater London > London (0.04)
- Europe > Sweden > Stockholm > Stockholm (0.04)
- Europe > France (0.04)
- (2 more...)
- Information Technology > Communications (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)